On the implementation of the harmonic plus noise model for concatenative speech synthesis
Abstract
In concatenative speech synthesis systems, speech models are usually used to represent the speech signal. Recently, the Harmonic plus Noise Model (HNM) has been proposed for concatenative speech synthesis, with promising results. One main drawback of HNM is its complexity. In this paper, we review four different synthesis methods with a view to reducing the complexity of HNM: straightforward synthesis (SF), synthesis using the inverse Fast Fourier Transform, synthesis using Recurrence Relations for trigonometric functions (RR), and synthesis based on Delayed Multi-Resampled Cosine functions (DMRC). DMRC was shown to outperform all the other techniques, reducing the complexity of the HNM synthesizer by 95% compared to the current version of HNM, which is based on the SF method. Informal listening tests showed that the version of HNM based on the DMRC method provides higher-quality synthetic speech than the version based on SF.
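To make the difference between the SF and RR approaches concrete, the following is a minimal sketch and not the paper's implementation. It assumes a single frame in which each harmonic has constant amplitude, phase, and frequency, and it uses the standard identity cos((n+1)θ + φ) = 2 cos(θ) cos(nθ + φ) − cos((n−1)θ + φ). The function names (`synth_direct`, `synth_recurrence`) and the parameter values in the usage lines are hypothetical, chosen only for illustration.

```python
import math

def synth_direct(amps, phases, w0, n_samples):
    """Straightforward synthesis: one cosine evaluation per harmonic per sample."""
    out = [0.0] * n_samples
    for k, (a, phi) in enumerate(zip(amps, phases), start=1):
        for n in range(n_samples):
            out[n] += a * math.cos(k * w0 * n + phi)
    return out

def synth_recurrence(amps, phases, w0, n_samples):
    """Recurrence-based synthesis: three cosine evaluations per harmonic,
    then one multiply and one subtract per sample."""
    out = [0.0] * n_samples
    for k, (a, phi) in enumerate(zip(amps, phases), start=1):
        theta = k * w0
        step = 2.0 * math.cos(theta)       # constant multiplier of the recurrence
        prev = a * math.cos(phi - theta)   # harmonic value at n = -1
        curr = a * math.cos(phi)           # harmonic value at n = 0
        for n in range(n_samples):
            out[n] += curr
            prev, curr = curr, step * curr - prev
    return out

# Hypothetical frame: 100 Hz fundamental at a 16 kHz sampling rate, 3 harmonics.
w0 = 2.0 * math.pi * 100.0 / 16000.0
amps = [1.0, 0.5, 0.25]
phases = [0.0, 0.3, -1.2]
direct = synth_direct(amps, phases, w0, 160)
fast = synth_recurrence(amps, phases, w0, 160)
max_err = max(abs(x - y) for x, y in zip(direct, fast))
```

In this sketch the two functions produce the same frame up to rounding error, while the recurrence replaces the per-sample cosine call with cheaper arithmetic. Rounding error in the recurrence accumulates with frame length, so a practical synthesizer would presumably reinitialize it at each frame boundary.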
Related resources
Applying the harmonic plus noise model in concatenative speech synthesis
This paper describes the application of the harmonic plus noise model (HNM) for concatenative text-to-speech (TTS) synthesis. In the context of HNM, speech signals are represented as a time-varying harmonic component plus a modulated noise component. The decomposition of a speech signal into these two components allows for more natural-sounding modifications of the signal (e.g., by using differ...
Concatenative speech synthesis using a harmonic plus noise model
This paper describes the application of the Harmonic plus Noise Model (HNM) for concatenative Text-to-Speech (TTS) synthesis. In the context of HNM, speech signals are represented as a time-varying harmonic component plus a modulated noise component. The decomposition of the speech signal into these two components allows for more natural-sounding modifications (e.g., source and filter modifications) of t...
Synchronization of speech frames based on phase data with application to concatenative speech synthesis
Synchronization of speech frames is an important issue in a concatenative speech synthesis system. In signal processing terms, this translates into removing linear phase mismatches between concatenated speech frames. This paper presents two novel approaches to the problem of synchronization of speech frames, with an application to concatenative speech synthesis. Both methods are based on a pr...
TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis
In an effort to select a speech representation for our next-generation concatenative text-to-speech synthesizer, the use of two candidates is investigated: TD-PSOLA and the Harmonic plus Noise Model (HNM). A formal listening test has been conducted, and the two candidates have been rated regarding intelligibility, naturalness and pleasantness. Ability for database compression and computational lo...
A hybrid method oriented to concatenative text-to-speech synthesis
In this paper we present a speech synthesis method for diphone-based text-to-speech systems. Its main goal is to achieve prosodic modifications that result in more natural-sounding synthetic speech. This improvement is especially useful for emotional speech synthesis, which requires high-quality prosodic modification. We present a hybrid method based on TD-PSOLA and the harmonic plus noise model...